Overview

Dataset statistics

Number of variables: 18
Number of observations: 37076
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.1 MiB
Average record size in memory: 144.0 B
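
The memory figures in the dataset statistics are mutually consistent, which a quick check can confirm. The sketch below assumes (the report does not state this) that each of the 18 columns takes 8 bytes per row; under that assumption the per-record size and the total follow from the variable and observation counts.

```python
# Sanity check of the memory figures in the dataset statistics.
# Assumption (not stated in the report): every column uses 8 bytes per row.
N_VARIABLES = 18
N_OBSERVATIONS = 37076
BYTES_PER_VALUE = 8  # assumed width, e.g. a 64-bit number or an object pointer

record_size_bytes = N_VARIABLES * BYTES_PER_VALUE
total_size_mib = record_size_bytes * N_OBSERVATIONS / 2**20

print(f"Average record size: {record_size_bytes:.1f} B")
print(f"Total size in memory: {total_size_mib:.1f} MiB")
```

If the assumption holds, both printed values agree with the figures reported for this dataset.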

Variable types

Numeric: 10
Categorical: 8

Alerts

Attempted is highly overall correlated with Full [High correlation]
NumberOfMajors is highly overall correlated with NumberOfUniqueMajors [High correlation]
NumberOfUniqueMajors is highly overall correlated with NumberOfMajors [High correlation]
Full is highly overall correlated with Attempted [High correlation]
Ethnicity is highly overall correlated with IsHispanic [High correlation]
IsHispanic is highly overall correlated with Ethnicity [High correlation]
Dev is highly imbalanced (79.0%) [Imbalance]
Ethnicity is highly imbalanced (54.6%) [Imbalance]
IsHispanic is highly imbalanced (59.3%) [Imbalance]
Internet has 15832 (42.7%) zeros [Zeros]
PercentageOfRepeats has 30558 (82.4%) zeros [Zeros]
PercentageOfHistDrop has 25109 (67.7%) zeros [Zeros]
CumGPALast has 602 (1.6%) zeros [Zeros]
PercentageOfAbsence has 30482 (82.2%) zeros [Zeros]
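
The imbalance percentages in the alerts can be reproduced from the category counts reported for each variable. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2(k) for k classes. The report does not state its formula, so treat this as a plausible reconstruction; the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    Assumed definition: 0 means perfectly balanced classes, 1 means a
    single class holds every observation.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts copied from the Dev and IsHispanic sections of this report.
dev_counts = [34448, 2192, 401, 35]
is_hispanic_counts = [31370, 5561, 145]

print(f"Dev: {imbalance_score(dev_counts):.1%}")
print(f"IsHispanic: {imbalance_score(is_hispanic_counts):.1%}")
```

Under this assumed definition the two scores round to the 79.0% and 59.3% shown in the alerts.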

Reproduction

Analysis started: 2023-09-01 17:49:52.075004
Analysis finished: 2023-09-01 17:50:10.712586
Duration: 18.64 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json

Variables

StudentID
Real number (ℝ)

Distinct: 14352
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5670171 × 10^9
Minimum: 1.0100064 × 10^8
Maximum: 1.0112756 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 1.0100064 × 10^8
5-th percentile: 1.970071 × 10^8
Q1: 5.8000927 × 10^8
median: 1.0111108 × 10^10
Q3: 1.0112361 × 10^10
95-th percentile: 1.0112742 × 10^10
Maximum: 1.0112756 × 10^10
Range: 1.0011756 × 10^10
Interquartile range (IQR): 9.532352 × 10^9

Descriptive statistics

Standard deviation: 4.654677 × 10^9
Coefficient of variation (CV): 0.70879625
Kurtosis: -1.6908924
Mean: 6.5670171 × 10^9
Median Absolute Deviation (MAD): 1626782.5
Skewness: -0.55315131
Sum: 2.4347872 × 10^14
Variance: 2.1666018 × 10^19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
712001867 | 13 | < 0.1%
1.011110826 × 10^10 | 12 | < 0.1%
1.011110491 × 10^10 | 12 | < 0.1%
1.011110424 × 10^10 | 12 | < 0.1%
502003442 | 12 | < 0.1%
761004119 | 12 | < 0.1%
403009141 | 11 | < 0.1%
114008764 | 11 | < 0.1%
273002269 | 11 | < 0.1%
434007061 | 11 | < 0.1%
Other values (14342) | 36959 | 99.7%
Value | Count | Frequency (%)
101000642 | 1 | < 0.1%
101001184 | 6 | < 0.1%
101003030 | 4 | < 0.1%
101004016 | 2 | < 0.1%
101008593 | 1 | < 0.1%
101008781 | 1 | < 0.1%
101009897 | 5 | < 0.1%
102001082 | 1 | < 0.1%
102002426 | 3 | < 0.1%
102002757 | 5 | < 0.1%
Value | Count | Frequency (%)
1.011275622 × 10^10 | 1 | < 0.1%
1.011275618 × 10^10 | 1 | < 0.1%
1.011275521 × 10^10 | 1 | < 0.1%
1.011275481 × 10^10 | 1 | < 0.1%
1.011275473 × 10^10 | 1 | < 0.1%
1.011275472 × 10^10 | 1 | < 0.1%
1.01127547 × 10^10 | 1 | < 0.1%
1.011275448 × 10^10 | 1 | < 0.1%
1.011275446 × 10^10 | 1 | < 0.1%
1.011275443 × 10^10 | 1 | < 0.1%

Attempted
Real number (ℝ)

Distinct: 27
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.2913205
Minimum: 1
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 6
median: 9
Q3: 12
95-th percentile: 16
Maximum: 29
Range: 28
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.1207612
Coefficient of variation (CV): 0.44350652
Kurtosis: -0.51622739
Mean: 9.2913205
Median Absolute Deviation (MAD): 3
Skewness: 0.29269749
Sum: 344485
Variance: 16.980673
Monotonicity: Not monotonic
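
Several of the statistics reported for Attempted are derived from others, so they can be cross-checked. The sketch below is an illustration only (the variable names are not from the report): it recomputes the coefficient of variation, variance, interquartile range and range from the reported mean, standard deviation, quartiles and extremes.

```python
# Values copied from the Attempted summary in this report.
mean = 9.2913205
std = 4.1207612
q1, q3 = 6, 12
minimum, maximum = 1, 29

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr} range={value_range}")
```

The recomputed values agree with the reported ones up to rounding of the inputs.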
Histogram with fixed size bins (bins=27)
Value | Count | Frequency (%)
6 | 5653 | 15.2%
12 | 4267 | 11.5%
3 | 3797 | 10.2%
9 | 3710 | 10.0%
8 | 3426 | 9.2%
13 | 3414 | 9.2%
10 | 2103 | 5.7%
7 | 2054 | 5.5%
4 | 1791 | 4.8%
15 | 1561 | 4.2%
Other values (17) | 5300 | 14.3%
Value | Count | Frequency (%)
1 | 32 | 0.1%
2 | 82 | 0.2%
3 | 3797 | 10.2%
4 | 1791 | 4.8%
5 | 197 | 0.5%
6 | 5653 | 15.2%
7 | 2054 | 5.5%
8 | 3426 | 9.2%
9 | 3710 | 10.0%
10 | 2103 | 5.7%
Value | Count | Frequency (%)
29 | 2 | < 0.1%
26 | 1 | < 0.1%
25 | 8 | < 0.1%
24 | 6 | < 0.1%
23 | 27 | 0.1%
22 | 35 | 0.1%
21 | 74 | 0.2%
20 | 82 | 0.2%
19 | 388 | 1.0%
18 | 312 | 0.8%

Full
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
0: 25015
1: 12061

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 37076
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 25015 | 67.5%
1 | 12061 | 32.5%
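
The Count and Frequency (%) columns follow directly from tallying values. As an illustration (not the report's own code), the sketch below rebuilds the table for Full from its two reported counts; `frequency_table` is a hypothetical helper.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, c, 100.0 * c / total) for v, c in counts.most_common()]

# Rebuild the Full column from the counts reported for it.
full = ["0"] * 25015 + ["1"] * 12061

for value, count, pct in frequency_table(full):
    print(f"{value} | {count} | {pct:.1f}%")
```

The same helper applies to any categorical column; only the formatting of the percentage (one decimal place here) is a presentation choice.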

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 25015 | 67.5%
1 | 12061 | 32.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 25015 | 67.5%
1 | 12061 | 32.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37076 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 25015 | 67.5%
1 | 12061 | 32.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37076 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 25015 | 67.5%
1 | 12061 | 32.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37076 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 25015 | 67.5%
1 | 12061 | 32.5%

Age
Real number (ℝ)

Distinct: 60
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.41833
Minimum: 14
Maximum: 73
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 14
5-th percentile: 17
Q1: 19
median: 21
Q3: 27
95-th percentile: 44
Maximum: 73
Range: 59
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.8026535
Coefficient of variation (CV): 0.36049367
Kurtosis: 3.362206
Mean: 24.41833
Median Absolute Deviation (MAD): 3
Skewness: 1.8482997
Sum: 905334
Variance: 77.486708
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
19 | 5017 | 13.5%
18 | 4532 | 12.2%
20 | 4396 | 11.9%
21 | 3125 | 8.4%
17 | 2868 | 7.7%
22 | 1984 | 5.4%
23 | 1526 | 4.1%
24 | 1179 | 3.2%
25 | 987 | 2.7%
26 | 907 | 2.4%
Other values (50) | 10555 | 28.5%
Value | Count | Frequency (%)
14 | 3 | < 0.1%
15 | 175 | 0.5%
16 | 600 | 1.6%
17 | 2868 | 7.7%
18 | 4532 | 12.2%
19 | 5017 | 13.5%
20 | 4396 | 11.9%
21 | 3125 | 8.4%
22 | 1984 | 5.4%
23 | 1526 | 4.1%
Value | Count | Frequency (%)
73 | 2 | < 0.1%
72 | 4 | < 0.1%
71 | 3 | < 0.1%
70 | 4 | < 0.1%
69 | 2 | < 0.1%
68 | 6 | < 0.1%
67 | 9 | < 0.1%
66 | 6 | < 0.1%
65 | 13 | < 0.1%
64 | 16 | < 0.1%

Dev
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
0: 34448
1: 2192
2: 401
3: 35

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 37076
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 34448 | 92.9%
1 | 2192 | 5.9%
2 | 401 | 1.1%
3 | 35 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 34448 | 92.9%
1 | 2192 | 5.9%
2 | 401 | 1.1%
3 | 35 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 34448 | 92.9%
1 | 2192 | 5.9%
2 | 401 | 1.1%
3 | 35 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37076 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 34448 | 92.9%
1 | 2192 | 5.9%
2 | 401 | 1.1%
3 | 35 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37076 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 34448 | 92.9%
1 | 2192 | 5.9%
2 | 401 | 1.1%
3 | 35 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37076 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 34448 | 92.9%
1 | 2192 | 5.9%
2 | 401 | 1.1%
3 | 35 | 0.1%

Internet
Real number (ℝ)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3446704
Minimum: 0
Maximum: 9
Zeros: 15832
Zeros (%): 42.7%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5564579
Coefficient of variation (CV): 1.1575014
Kurtosis: 0.48912431
Mean: 1.3446704
Median Absolute Deviation (MAD): 1
Skewness: 1.0962863
Sum: 49855
Variance: 2.4225612
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
0 | 15832 | 42.7%
1 | 7448 | 20.1%
2 | 5863 | 15.8%
3 | 3434 | 9.3%
4 | 2768 | 7.5%
5 | 1216 | 3.3%
6 | 396 | 1.1%
7 | 103 | 0.3%
8 | 14 | < 0.1%
9 | 2 | < 0.1%
Value | Count | Frequency (%)
0 | 15832 | 42.7%
1 | 7448 | 20.1%
2 | 5863 | 15.8%
3 | 3434 | 9.3%
4 | 2768 | 7.5%
5 | 1216 | 3.3%
6 | 396 | 1.1%
7 | 103 | 0.3%
8 | 14 | < 0.1%
9 | 2 | < 0.1%
Value | Count | Frequency (%)
9 | 2 | < 0.1%
8 | 14 | < 0.1%
7 | 103 | 0.3%
6 | 396 | 1.1%
5 | 1216 | 3.3%
4 | 2768 | 7.5%
3 | 3434 | 9.3%
2 | 5863 | 15.8%
1 | 7448 | 20.1%
0 | 15832 | 42.7%
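
Because all ten distinct values of Internet are listed, its summary statistics can be recomputed from the frequency table alone. The sketch below does this for the mean, the sample variance and the median. It assumes the report uses the n - 1 denominator for the variance, which is consistent with the reported figure; the helper names are illustrative.

```python
import math

# Value -> count, copied from the Internet frequency table in this report.
internet_counts = {0: 15832, 1: 7448, 2: 5863, 3: 3434, 4: 2768,
                   5: 1216, 6: 396, 7: 103, 8: 14, 9: 2}

def weighted_stats(counts):
    """Return (n, mean, sample variance) for a value -> count mapping."""
    n = sum(counts.values())
    mean = sum(v * c for v, c in counts.items()) / n
    ss = sum(c * (v - mean) ** 2 for v, c in counts.items())
    return n, mean, ss / (n - 1)  # n - 1 denominator is an assumption

def weighted_median(counts):
    """Return the median, averaging the two middle values when n is even."""
    n = sum(counts.values())
    lo_rank, hi_rank = (n - 1) // 2, n // 2  # zero-based middle ranks
    picked, seen = [], 0
    for value in sorted(counts):
        nxt = seen + counts[value]
        for rank in (lo_rank, hi_rank):
            if seen <= rank < nxt:
                picked.append(value)
        seen = nxt
    return sum(picked) / 2

n, mean, variance = weighted_stats(internet_counts)
print(n, round(mean, 7), round(variance, 6), round(math.sqrt(variance), 6))
print(weighted_median(internet_counts))
```

The results can be compared against the Mean, Variance, Standard deviation and median entries reported for Internet.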

PercentageOfRepeats
Real number (ℝ)

Distinct: 38
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.096953554
Minimum: 0
Maximum: 1
Zeros: 30558
Zeros (%): 82.4%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.75
Maximum: 1
Range: 1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.24431757
Coefficient of variation (CV): 2.5199444
Kurtosis: 6.2010217
Mean: 0.096953554
Median Absolute Deviation (MAD): 0
Skewness: 2.6710229
Sum: 3594.65
Variance: 0.059691073
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
Value | Count | Frequency (%)
0 | 30558 | 82.4%
1 | 1550 | 4.2%
0.5 | 1053 | 2.8%
0.25 | 788 | 2.1%
0.3333333333 | 694 | 1.9%
0.2 | 559 | 1.5%
0.6666666667 | 461 | 1.2%
0.4 | 373 | 1.0%
0.1666666667 | 233 | 0.6%
0.75 | 193 | 0.5%
Other values (28) | 614 | 1.7%
Value | Count | Frequency (%)
0 | 30558 | 82.4%
0.09090909091 | 1 | < 0.1%
0.1 | 1 | < 0.1%
0.1111111111 | 3 | < 0.1%
0.125 | 18 | < 0.1%
0.1428571429 | 62 | 0.2%
0.1666666667 | 233 | 0.6%
0.2 | 559 | 1.5%
0.2222222222 | 10 | < 0.1%
0.25 | 788 | 2.1%
Value | Count | Frequency (%)
1 | 1550 | 4.2%
0.9090909091 | 2 | < 0.1%
0.9 | 1 | < 0.1%
0.8888888889 | 6 | < 0.1%
0.875 | 7 | < 0.1%
0.8571428571 | 18 | < 0.1%
0.8333333333 | 32 | 0.1%
0.8 | 96 | 0.3%
0.7777777778 | 3 | < 0.1%
0.75 | 193 | 0.5%

PercentageOfHistDrop
Real number (ℝ)

Distinct: 360
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.05996983
Minimum: 0
Maximum: 1
Zeros: 25109
Zeros (%): 67.7%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0.076923077
95-th percentile: 0.3
Maximum: 1
Range: 1
Interquartile range (IQR): 0.076923077

Descriptive statistics

Standard deviation: 0.12968569
Coefficient of variation (CV): 2.1625156
Kurtosis: 18.63201
Mean: 0.05996983
Median Absolute Deviation (MAD): 0
Skewness: 3.7062475
Sum: 2223.4414
Variance: 0.016818379
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 25109 | 67.7%
0.2 | 591 | 1.6%
0.25 | 523 | 1.4%
0.1666666667 | 477 | 1.3%
0.125 | 430 | 1.2%
0.1428571429 | 427 | 1.2%
0.1111111111 | 403 | 1.1%
0.3333333333 | 399 | 1.1%
0.1 | 379 | 1.0%
0.5 | 348 | 0.9%
Other values (350) | 7990 | 21.6%
Value | Count | Frequency (%)
0 | 25109 | 67.7%
0.01333333333 | 1 | < 0.1%
0.01428571429 | 1 | < 0.1%
0.01515151515 | 1 | < 0.1%
0.01754385965 | 2 | < 0.1%
0.01818181818 | 3 | < 0.1%
0.01886792453 | 1 | < 0.1%
0.01923076923 | 3 | < 0.1%
0.01960784314 | 1 | < 0.1%
0.02 | 4 | < 0.1%
Value | Count | Frequency (%)
1 | 221 | 0.6%
0.8571428571 | 1 | < 0.1%
0.8461538462 | 1 | < 0.1%
0.8333333333 | 4 | < 0.1%
0.8 | 12 | < 0.1%
0.75 | 19 | 0.1%
0.7272727273 | 1 | < 0.1%
0.7142857143 | 7 | < 0.1%
0.7 | 1 | < 0.1%
0.6875 | 1 | < 0.1%

TermCode
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
B17C: 3436
B20C: 3342
B18C: 3336
B19C: 3124
B22C: 3030
Other values (9): 20808

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 148304
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
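
Character-level counts of this kind can be reproduced with Python's standard `unicodedata` module. The sketch below is an illustration run on the five sample TermCode rows listed in this report, not the profiler's own implementation; `category_counts` is a hypothetical helper. It tallies the Unicode general category of each character ("Lu" is Uppercase Letter, "Nd" is Decimal Number).

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count characters by Unicode general category across all strings."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

# The five sample rows listed for TermCode in this report.
sample = ["B17C", "B17C", "B18Q", "B20Q", "B21Q"]

counts = category_counts(sample)
total = sum(counts.values())
print(total, dict(counts))
```

Every TermCode value is four characters long, so the 37076 rows give 37076 × 4 = 148304 characters in total, and the letter-digit-digit-letter pattern explains the even split between the two categories.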

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B17C
2nd row: B17C
3rd row: B18Q
4th row: B20Q
5th row: B21Q

Common Values

Value | Count | Frequency (%)
B17C | 3436 | 9.3%
B20C | 3342 | 9.0%
B18C | 3336 | 9.0%
B19C | 3124 | 8.4%
B22C | 3030 | 8.2%
B23C | 2985 | 8.1%
B21C | 2809 | 7.6%
B17Q | 2708 | 7.3%
B18Q | 2506 | 6.8%
B19Q | 2463 | 6.6%
Other values (4) | 7337 | 19.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
b17c | 3436 | 9.3%
b20c | 3342 | 9.0%
b18c | 3336 | 9.0%
b19c | 3124 | 8.4%
b22c | 3030 | 8.2%
b23c | 2985 | 8.1%
b21c | 2809 | 7.6%
b17q | 2708 | 7.3%
b18q | 2506 | 6.8%
b19q | 2463 | 6.6%
Other values (4) | 7337 | 19.8%

Most occurring characters

Value | Count | Frequency (%)
B | 37076 | 25.0%
2 | 23464 | 15.8%
1 | 22548 | 15.2%
C | 22062 | 14.9%
Q | 15014 | 10.1%
7 | 6144 | 4.1%
8 | 5842 | 3.9%
0 | 5680 | 3.8%
9 | 5587 | 3.8%
3 | 4887 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 74152 | 50.0%
Decimal Number | 74152 | 50.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 23464 | 31.6%
1 | 22548 | 30.4%
7 | 6144 | 8.3%
8 | 5842 | 7.9%
0 | 5680 | 7.7%
9 | 5587 | 7.5%
3 | 4887 | 6.6%
Uppercase Letter
Value | Count | Frequency (%)
B | 37076 | 50.0%
C | 22062 | 29.8%
Q | 15014 | 20.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 74152 | 50.0%
Common | 74152 | 50.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 23464 | 31.6%
1 | 22548 | 30.4%
7 | 6144 | 8.3%
8 | 5842 | 7.9%
0 | 5680 | 7.7%
9 | 5587 | 7.5%
3 | 4887 | 6.6%
Latin
Value | Count | Frequency (%)
B | 37076 | 50.0%
C | 22062 | 29.8%
Q | 15014 | 20.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 148304 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
B | 37076 | 25.0%
2 | 23464 | 15.8%
1 | 22548 | 15.2%
C | 22062 | 14.9%
Q | 15014 | 10.1%
7 | 6144 | 4.1%
8 | 5842 | 3.9%
0 | 5680 | 3.8%
9 | 5587 | 3.8%
3 | 4887 | 3.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
0: 31704
1: 5372

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 37076
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 31704 | 85.5%
1 | 5372 | 14.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 31704 | 85.5%
1 | 5372 | 14.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 31704 | 85.5%
1 | 5372 | 14.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37076 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 31704 | 85.5%
1 | 5372 | 14.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37076 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 31704 | 85.5%
1 | 5372 | 14.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37076 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 31704 | 85.5%
1 | 5372 | 14.5%

NumberOfMajors
Real number (ℝ)

Distinct: 30
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9280127
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
median: 5
Q3: 8
95-th percentile: 13
Maximum: 36
Range: 35
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.5113643
Coefficient of variation (CV): 0.59233414
Kurtosis: 2.1119313
Mean: 5.9280127
Median Absolute Deviation (MAD): 2
Skewness: 1.2665267
Sum: 219787
Variance: 12.329679
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
4 | 5827 | 15.7%
2 | 5332 | 14.4%
5 | 4621 | 12.5%
3 | 4335 | 11.7%
6 | 3518 | 9.5%
7 | 3224 | 8.7%
8 | 2520 | 6.8%
9 | 1781 | 4.8%
10 | 1603 | 4.3%
11 | 1169 | 3.2%
Other values (20) | 3146 | 8.5%
Value | Count | Frequency (%)
1 | 297 | 0.8%
2 | 5332 | 14.4%
3 | 4335 | 11.7%
4 | 5827 | 15.7%
5 | 4621 | 12.5%
6 | 3518 | 9.5%
7 | 3224 | 8.7%
8 | 2520 | 6.8%
9 | 1781 | 4.8%
10 | 1603 | 4.3%
Value | Count | Frequency (%)
36 | 1 | < 0.1%
34 | 1 | < 0.1%
28 | 1 | < 0.1%
27 | 3 | < 0.1%
26 | 5 | < 0.1%
25 | 5 | < 0.1%
24 | 7 | < 0.1%
23 | 10 | < 0.1%
22 | 19 | 0.1%
21 | 30 | 0.1%

NumberOfUniqueMajors
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.603652
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 3
Maximum: 7
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.78204162
Coefficient of variation (CV): 0.48766293
Kurtosis: 1.8685097
Mean: 1.603652
Median Absolute Deviation (MAD): 0
Skewness: 1.3241148
Sum: 59457
Variance: 0.61158909
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
1 | 20339 | 54.9%
2 | 12158 | 32.8%
3 | 3676 | 9.9%
4 | 770 | 2.1%
5 | 106 | 0.3%
6 | 25 | 0.1%
7 | 2 | < 0.1%
Value | Count | Frequency (%)
1 | 20339 | 54.9%
2 | 12158 | 32.8%
3 | 3676 | 9.9%
4 | 770 | 2.1%
5 | 106 | 0.3%
6 | 25 | 0.1%
7 | 2 | < 0.1%
Value | Count | Frequency (%)
7 | 2 | < 0.1%
6 | 25 | 0.1%
5 | 106 | 0.3%
4 | 770 | 2.1%
3 | 3676 | 9.9%
2 | 12158 | 32.8%
1 | 20339 | 54.9%

CumGPALast
Real number (ℝ)

Distinct: 5072
Distinct (%): 13.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.951954
Minimum: 0
Maximum: 4
Zeros: 602
Zeros (%): 1.6%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.3
Q1: 2.5
median: 3.0513
Q3: 3.598575
95-th percentile: 4
Maximum: 4
Range: 4
Interquartile range (IQR): 1.098575

Descriptive statistics

Standard deviation: 0.86017982
Coefficient of variation (CV): 0.29139336
Kurtosis: 1.4421768
Mean: 2.951954
Median Absolute Deviation (MAD): 0.5513
Skewness: -1.1287391
Sum: 109446.65
Variance: 0.73990932
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
4 | 4514 | 12.2%
3 | 2611 | 7.0%
3.5 | 1193 | 3.2%
2 | 1185 | 3.2%
2.5 | 755 | 2.0%
0 | 602 | 1.6%
3.25 | 443 | 1.2%
3.75 | 374 | 1.0%
3.6667 | 346 | 0.9%
3.3333 | 340 | 0.9%
Other values (5062) | 24713 | 66.7%
Value | Count | Frequency (%)
0 | 602 | 1.6%
0.0667 | 1 | < 0.1%
0.0702 | 1 | < 0.1%
0.087 | 1 | < 0.1%
0.1071 | 1 | < 0.1%
0.1364 | 2 | < 0.1%
0.1429 | 5 | < 0.1%
0.15 | 1 | < 0.1%
0.1538 | 2 | < 0.1%
0.1579 | 2 | < 0.1%
Value | Count | Frequency (%)
4 | 4514 | 12.2%
3.9897 | 1 | < 0.1%
3.9804 | 1 | < 0.1%
3.9701 | 1 | < 0.1%
3.9677 | 1 | < 0.1%
3.9672 | 1 | < 0.1%
3.9671 | 1 | < 0.1%
3.9634 | 1 | < 0.1%
3.962 | 1 | < 0.1%
3.9615 | 1 | < 0.1%

PercentageOfAbsence
Real number (ℝ)

Distinct: 77
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.073110454
Minimum: 0
Maximum: 1
Zeros: 30482
Zeros (%): 82.2%
Negative: 0
Negative (%): 0.0%
Memory size: 289.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.5
Maximum: 1
Range: 1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.19455806
Coefficient of variation (CV): 2.6611524
Kurtosis: 10.430473
Mean: 0.073110454
Median Absolute Deviation (MAD): 0
Skewness: 3.1967747
Sum: 2710.6432
Variance: 0.037852838
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 30482 | 82.2%
0.5 | 940 | 2.5%
0.3333333333 | 890 | 2.4%
1 | 765 | 2.1%
0.25 | 698 | 1.9%
0.2 | 533 | 1.4%
0.1666666667 | 401 | 1.1%
0.1428571429 | 307 | 0.8%
0.6666666667 | 233 | 0.6%
0.4 | 229 | 0.6%
Other values (67) | 1598 | 4.3%
Value | Count | Frequency (%)
0 | 30482 | 82.2%
0.05555555556 | 2 | < 0.1%
0.05882352941 | 2 | < 0.1%
0.0625 | 6 | < 0.1%
0.06666666667 | 7 | < 0.1%
0.07142857143 | 9 | < 0.1%
0.07692307692 | 21 | 0.1%
0.08333333333 | 32 | 0.1%
0.09090909091 | 58 | 0.2%
0.1 | 87 | 0.2%
Value | Count | Frequency (%)
1 | 765 | 2.1%
0.9285714286 | 1 | < 0.1%
0.9166666667 | 1 | < 0.1%
0.9 | 1 | < 0.1%
0.8888888889 | 5 | < 0.1%
0.875 | 8 | < 0.1%
0.8571428571 | 16 | < 0.1%
0.8461538462 | 1 | < 0.1%
0.8333333333 | 25 | 0.1%
0.8125 | 1 | < 0.1%

Ethnicity
Categorical

Alerts: High correlation, Imbalance

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
White, Non-Hispanic: 25651
Hispanic: 5499
Black, Non-Hispanic: 2920
American Indian or Alaskan Native: 1650
Asian: 726
Other values (5): 630

Length

Max length: 35
Median length: 19
Mean length: 17.679604
Min length: 5

Characters and Unicode

Total characters: 655489
Distinct characters: 32
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: White, Non-Hispanic
2nd row: White, Non-Hispanic
3rd row: White, Non-Hispanic
4th row: White, Non-Hispanic
5th row: White, Non-Hispanic

Common Values

Value | Count | Frequency (%)
White, Non-Hispanic | 25651 | 69.2%
Hispanic | 5499 | 14.8%
Black, Non-Hispanic | 2920 | 7.9%
American Indian or Alaskan Native | 1650 | 4.5%
Asian | 726 | 2.0%
International | 188 | 0.5%
unknown | 185 | 0.5%
Unknown or Not Reported | 174 | 0.5%
Native Hawaiian or Pacific Islander | 77 | 0.2%
Not Hispanic or Latino | 6 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
non-hispanic | 28571 | 39.1%
white | 25651 | 35.1%
hispanic | 5505 | 7.5%
black | 2920 | 4.0%
or | 1907 | 2.6%
native | 1727 | 2.4%
american | 1650 | 2.3%
indian | 1650 | 2.3%
alaskan | 1650 | 2.3%
asian | 726 | 1.0%
Other values (8) | 1138 | 1.6%

Most occurring characters

Value | Count | Frequency (%)
i | 100058 | 15.3%
n | 71774 | 10.9%
a | 46816 | 7.1%
c | 38800 | 5.9%
s | 36529 | 5.6%
(space) | 36019 | 5.5%
p | 34250 | 5.2%
H | 34153 | 5.2%
o | 31385 | 4.8%
N | 30478 | 4.6%
Other values (22) | 195227 | 29.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 462754 | 70.6%
Uppercase Letter | 99574 | 15.2%
Space Separator | 36019 | 5.5%
Dash Punctuation | 28571 | 4.4%
Other Punctuation | 28571 | 4.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 100058 | 21.6%
n | 71774 | 15.5%
a | 46816 | 10.1%
c | 38800 | 8.4%
s | 36529 | 7.9%
p | 34250 | 7.4%
o | 31385 | 6.8%
e | 29641 | 6.4%
t | 28114 | 6.1%
h | 25651 | 5.5%
Other values (9) | 19736 | 4.3%
Uppercase Letter
Value | Count | Frequency (%)
H | 34153 | 34.3%
N | 30478 | 30.6%
W | 25651 | 25.8%
A | 4026 | 4.0%
B | 2920 | 2.9%
I | 1915 | 1.9%
U | 174 | 0.2%
R | 174 | 0.2%
P | 77 | 0.1%
L | 6 | < 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 36019 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 28571 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 28571 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 562328 | 85.8%
Common | 93161 | 14.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 100058 | 17.8%
n | 71774 | 12.8%
a | 46816 | 8.3%
c | 38800 | 6.9%
s | 36529 | 6.5%
p | 34250 | 6.1%
H | 34153 | 6.1%
o | 31385 | 5.6%
N | 30478 | 5.4%
e | 29641 | 5.3%
Other values (19) | 108444 | 19.3%
Common
Value | Count | Frequency (%)
(space) | 36019 | 38.7%
- | 28571 | 30.7%
, | 28571 | 30.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 655489 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 100058 | 15.3%
n | 71774 | 10.9%
a | 46816 | 7.1%
c | 38800 | 5.9%
s | 36529 | 5.6%
(space) | 36019 | 5.5%
p | 34250 | 5.2%
H | 34153 | 5.2%
o | 31385 | 4.8%
N | 30478 | 4.6%
Other values (22) | 195227 | 29.8%

Gender
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
Female: 23357
Male: 13674
unknown: 45

Length

Max length: 7
Median length: 6
Mean length: 5.2635937
Min length: 4

Characters and Unicode

Total characters: 195153
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Female
4th row: Female
5th row: Female

Common Values

Value | Count | Frequency (%)
Female | 23357 | 63.0%
Male | 13674 | 36.9%
unknown | 45 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
female | 23357 | 63.0%
male | 13674 | 36.9%
unknown | 45 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 60388 | 30.9%
a | 37031 | 19.0%
l | 37031 | 19.0%
F | 23357 | 12.0%
m | 23357 | 12.0%
M | 13674 | 7.0%
n | 135 | 0.1%
u | 45 | < 0.1%
k | 45 | < 0.1%
o | 45 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 158122 | 81.0%
Uppercase Letter | 37031 | 19.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 60388 | 38.2%
a | 37031 | 23.4%
l | 37031 | 23.4%
m | 23357 | 14.8%
n | 135 | 0.1%
u | 45 | < 0.1%
k | 45 | < 0.1%
o | 45 | < 0.1%
w | 45 | < 0.1%
Uppercase Letter
Value | Count | Frequency (%)
F | 23357 | 63.1%
M | 13674 | 36.9%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 195153 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 60388 | 30.9%
a | 37031 | 19.0%
l | 37031 | 19.0%
F | 23357 | 12.0%
m | 23357 | 12.0%
M | 13674 | 7.0%
n | 135 | 0.1%
u | 45 | < 0.1%
k | 45 | < 0.1%
o | 45 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 195153 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 60388 | 30.9%
a | 37031 | 19.0%
l | 37031 | 19.0%
F | 23357 | 12.0%
m | 23357 | 12.0%
M | 13674 | 7.0%
n | 135 | 0.1%
u | 45 | < 0.1%
k | 45 | < 0.1%
o | 45 | < 0.1%

IsHispanic
Categorical

Alerts: High correlation, Imbalance

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
No: 31370
Yes: 5561
unknown: 145

Length

Max length: 7
Median length: 2
Mean length: 2.1695436
Min length: 2

Characters and Unicode

Total characters: 80438
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value | Count | Frequency (%)
No | 31370 | 84.6%
Yes | 5561 | 15.0%
unknown | 145 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 31370 | 84.6%
yes | 5561 | 15.0%
unknown | 145 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
o | 31515 | 39.2%
N | 31370 | 39.0%
Y | 5561 | 6.9%
e | 5561 | 6.9%
s | 5561 | 6.9%
n | 435 | 0.5%
u | 145 | 0.2%
k | 145 | 0.2%
w | 145 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 43507 | 54.1%
Uppercase Letter | 36931 | 45.9%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 31515 | 72.4%
e | 5561 | 12.8%
s | 5561 | 12.8%
n | 435 | 1.0%
u | 145 | 0.3%
k | 145 | 0.3%
w | 145 | 0.3%
Uppercase Letter
Value | Count | Frequency (%)
N | 31370 | 84.9%
Y | 5561 | 15.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 80438 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 31515 | 39.2%
N | 31370 | 39.0%
Y | 5561 | 6.9%
e | 5561 | 6.9%
s | 5561 | 6.9%
n | 435 | 0.5%
u | 145 | 0.2%
k | 145 | 0.2%
w | 145 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 80438 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 31515 | 39.2%
N | 31370 | 39.0%
Y | 5561 | 6.9%
e | 5561 | 6.9%
s | 5561 | 6.9%
n | 435 | 0.5%
u | 145 | 0.2%
k | 145 | 0.2%
w | 145 | 0.2%

Target
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.8 KiB
0: 32172
1: 4904

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 37076
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 32172 | 86.8%
1 | 4904 | 13.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 32172 | 86.8%
1 | 4904 | 13.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 32172 | 86.8%
1 | 4904 | 13.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37076 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 32172 | 86.8%
1 | 4904 | 13.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37076 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 32172 | 86.8%
1 | 4904 | 13.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37076 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 32172 | 86.8%
1 | 4904 | 13.2%

Interactions

[Figure grid: 100 pairwise interaction plots of the numeric variables]

Correlations

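Variable | StudentID | Attempted | Age | Internet | PercentageOfRepeats | PercentageOfHistDrop | NumberOfMajors | NumberOfUniqueMajors | CumGPALast | PercentageOfAbsence | Full | Dev | TermCode | MajorChangedFromLast | Ethnicity | Gender | IsHispanic | Target
StudentID | 1.000 | -0.070 | -0.465 | 0.061 | -0.149 | -0.256 | -0.258 | -0.167 | 0.137 | 0.045 | 0.014 | 0.003 | 0.408 | 0.099 | 0.071 | 0.076 | 0.051 | 0.004
Attempted | -0.070 | 1.000 | 0.163 | 0.384 | 0.064 | 0.049 | 0.126 | 0.162 | -0.033 | 0.029 | 0.775 | 0.105 | 0.066 | 0.108 | 0.036 | 0.042 | 0.028 | 0.135
Age | -0.465 | 0.163 | 1.000 | -0.005 | 0.165 | 0.265 | 0.377 | 0.340 | -0.132 | 0.006 | 0.143 | 0.033 | 0.028 | 0.107 | 0.074 | 0.084 | 0.091 | 0.054
Internet | 0.061 | 0.384 | -0.005 | 1.000 | 0.072 | 0.062 | 0.088 | 0.063 | -0.021 | 0.068 | 0.401 | 0.042 | 0.077 | 0.048 | 0.016 | 0.031 | 0.016 | 0.138
PercentageOfRepeats | -0.149 | 0.064 | 0.165 | 0.072 | 1.000 | 0.324 | 0.140 | 0.062 | -0.342 | 0.123 | 0.219 | 0.134 | 0.036 | 0.041 | 0.027 | 0.006 | 0.028 | 0.129
PercentageOfHistDrop | -0.256 | 0.049 | 0.265 | 0.062 | 0.324 | 1.000 | 0.287 | 0.219 | -0.276 | 0.075 | 0.021 | 0.076 | 0.025 | 0.085 | 0.017 | 0.006 | 0.008 | 0.117
NumberOfMajors | -0.258 | 0.126 | 0.377 | 0.088 | 0.140 | 0.287 | 1.000 | 0.517 | -0.058 | -0.006 | 0.119 | 0.055 | 0.104 | 0.079 | 0.022 | 0.065 | 0.033 | 0.040
NumberOfUniqueMajors | -0.167 | 0.162 | 0.340 | 0.063 | 0.062 | 0.219 | 0.517 | 1.000 | -0.108 | 0.006 | 0.124 | 0.022 | 0.117 | 0.458 | 0.026 | 0.040 | 0.023 | 0.021
CumGPALast | 0.137 | -0.033 | -0.132 | -0.021 | -0.342 | -0.276 | -0.058 | -0.108 | 1.000 | -0.189 | 0.055 | 0.172 | 0.029 | 0.103 | 0.055 | 0.054 | 0.046 | 0.146
PercentageOfAbsence | 0.045 | 0.029 | 0.006 | 0.068 | 0.123 | 0.075 | -0.006 | 0.006 | -0.189 | 1.000 | 0.109 | 0.065 | 0.046 | 0.009 | 0.023 | 0.047 | 0.012 | 0.122
Full | 0.014 | 0.775 | 0.143 | 0.401 | 0.219 | 0.021 | 0.119 | 0.124 | 0.055 | 0.109 | 1.000 | 0.071 | 0.087 | 0.072 | 0.078 | 0.024 | 0.021 | 0.166
Dev | 0.003 | 0.105 | 0.033 | 0.042 | 0.134 | 0.076 | 0.055 | 0.022 | 0.172 | 0.065 | 0.071 | 1.000 | 0.056 | 0.059 | 0.053 | 0.000 | 0.014 | 0.104
TermCode | 0.408 | 0.066 | 0.028 | 0.077 | 0.036 | 0.025 | 0.104 | 0.117 | 0.029 | 0.046 | 0.087 | 0.056 | 1.000 | 0.122 | 0.035 | 0.049 | 0.052 | 0.109
MajorChangedFromLast | 0.099 | 0.108 | 0.107 | 0.048 | 0.041 | 0.085 | 0.079 | 0.458 | 0.103 | 0.009 | 0.072 | 0.059 | 0.122 | 1.000 | 0.019 | 0.017 | 0.012 | 0.041
Ethnicity | 0.071 | 0.036 | 0.074 | 0.016 | 0.027 | 0.017 | 0.022 | 0.026 | 0.055 | 0.023 | 0.078 | 0.053 | 0.035 | 0.019 | 1.000 | 0.104 | 0.695 | 0.030
Gender | 0.076 | 0.042 | 0.084 | 0.031 | 0.006 | 0.006 | 0.065 | 0.040 | 0.054 | 0.047 | 0.024 | 0.000 | 0.049 | 0.017 | 0.104 | 1.000 | 0.006 | 0.011
IsHispanic | 0.051 | 0.028 | 0.091 | 0.016 | 0.028 | 0.008 | 0.033 | 0.023 | 0.046 | 0.012 | 0.021 | 0.014 | 0.052 | 0.012 | 0.695 | 0.006 | 1.000 | 0.007
Target | 0.004 | 0.135 | 0.054 | 0.138 | 0.129 | 0.117 | 0.040 | 0.021 | 0.146 | 0.122 | 0.166 | 0.104 | 0.109 | 0.041 | 0.030 | 0.011 | 0.007 | 1.000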
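
The "High correlation" alerts in the Overview can be cross-checked against this matrix. A rough sketch, assuming an illustrative cutoff of 0.5 on the absolute coefficient (the cutoff and the helper name `high_pairs` are assumptions, not settings confirmed by the report); only a handful of the largest off-diagonal entries are copied in:

```python
# Selected off-diagonal entries copied from the correlation matrix above.
pairs = {
    ("Attempted", "Full"): 0.775,
    ("Ethnicity", "IsHispanic"): 0.695,
    ("NumberOfMajors", "NumberOfUniqueMajors"): 0.517,
    ("StudentID", "Age"): -0.465,
    ("NumberOfUniqueMajors", "MajorChangedFromLast"): 0.458,
    ("StudentID", "TermCode"): 0.408,
    ("Internet", "Full"): 0.401,
}

THRESHOLD = 0.5  # illustrative cutoff (assumption, not taken from the report)


def high_pairs(pairs, threshold):
    """Return variable pairs whose |coefficient| meets the threshold, strongest first."""
    hits = [p for p, r in pairs.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda p: -abs(pairs[p]))


flagged = high_pairs(pairs, THRESHOLD)
for a, b in flagged:
    print(f"{a} ~ {b}: {pairs[(a, b)]:+.3f}")
```

Under that assumed cutoff, the flagged pairs are the same three pairs named in the Overview alerts.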

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

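Index | StudentID | Attempted | Full | Age | Dev | Internet | PercentageOfRepeats | PercentageOfHistDrop | TermCode | MajorChangedFromLast | NumberOfMajors | NumberOfUniqueMajors | CumGPALast | PercentageOfAbsence | Ethnicity | Gender | IsHispanic | Target
0 | 101000642 | 6.0 | 0 | 26.0 | 0 | 2 | 0.00 | 0.142857 | B17C | 0 | 5 | 1 | 1.6857 | 0.000000 | White, Non-Hispanic | Female | No | 0
1 | 101001184 | 16.0 | 1 | 27.0 | 0 | 0 | 0.25 | 0.000000 | B17C | 0 | 3 | 1 | 2.6000 | 0.000000 | White, Non-Hispanic | Female | No | 0
2 | 101001184 | 10.0 | 0 | 28.0 | 1 | 0 | 0.00 | 0.000000 | B18Q | 1 | 5 | 2 | 2.4286 | 0.000000 | White, Non-Hispanic | Female | No | 0
3 | 101001184 | 6.0 | 0 | 30.0 | 0 | 2 | 0.00 | 0.000000 | B20Q | 1 | 7 | 3 | 2.1346 | 0.000000 | White, Non-Hispanic | Female | No | 0
4 | 101001184 | 12.0 | 1 | 31.0 | 0 | 0 | 0.00 | 0.000000 | B21Q | 0 | 10 | 3 | 1.9138 | 0.000000 | White, Non-Hispanic | Female | No | 0
5 | 101001184 | 6.0 | 0 | 32.0 | 0 | 1 | 0.00 | 0.000000 | B22Q | 1 | 13 | 4 | 2.1000 | 1.000000 | White, Non-Hispanic | Female | No | 0
6 | 101001184 | 3.0 | 0 | 33.0 | 0 | 0 | 0.00 | 0.000000 | B23C | 1 | 14 | 5 | 2.1711 | 0.333333 | White, Non-Hispanic | Female | No | 0
7 | 101009897 | 8.0 | 0 | 20.0 | 0 | 0 | 0.00 | 0.000000 | B17C | 0 | 4 | 1 | 3.6970 | 0.000000 | White, Non-Hispanic | Female | No | 0
8 | 101009897 | 8.0 | 0 | 20.0 | 0 | 0 | 1.00 | 0.000000 | B17Q | 0 | 6 | 1 | 3.2683 | 0.000000 | White, Non-Hispanic | Female | No | 0
9 | 101009897 | 7.0 | 0 | 21.0 | 0 | 1 | 0.80 | 0.000000 | B18Q | 0 | 9 | 1 | 3.4634 | 0.000000 | White, Non-Hispanic | Female | No | 0

Index | StudentID | Attempted | Full | Age | Dev | Internet | PercentageOfRepeats | PercentageOfHistDrop | TermCode | MajorChangedFromLast | NumberOfMajors | NumberOfUniqueMajors | CumGPALast | PercentageOfAbsence | Ethnicity | Gender | IsHispanic | Target
37066 | 10112754429 | 11.0 | 1 | 22.0 | 0 | 4 | 0.000000 | 0.25 | B23Q | 0 | 3 | 1 | 3.2857 | 0.0 | Black, Non-Hispanic | Male | No | 0
37067 | 10112754465 | 9.0 | 1 | 21.0 | 0 | 1 | 0.666667 | 0.00 | B23Q | 0 | 3 | 1 | 0.7500 | 0.5 | Asian | Female | No | 0
37068 | 10112754483 | 8.0 | 0 | 20.0 | 1 | 1 | 0.000000 | 0.00 | B23Q | 0 | 3 | 1 | 4.0000 | 0.0 | White, Non-Hispanic | Male | No | 0
37069 | 10112754700 | 4.0 | 0 | 36.0 | 0 | 0 | 1.000000 | 0.25 | B23Q | 0 | 3 | 1 | 0.6667 | 0.5 | White, Non-Hispanic | Female | No | 0
37070 | 10112754720 | 7.0 | 0 | 25.0 | 0 | 0 | 0.333333 | 0.00 | B23Q | 0 | 3 | 1 | 0.0000 | 0.0 | White, Non-Hispanic | Female | No | 0
37071 | 10112754731 | 9.0 | 1 | 34.0 | 0 | 2 | 0.333333 | 0.25 | B23Q | 0 | 3 | 1 | 2.0000 | 0.5 | Hispanic | Male | Yes | 0
37072 | 10112754814 | 4.0 | 0 | 31.0 | 0 | 0 | 0.000000 | 0.00 | B23Q | 0 | 3 | 1 | 4.0000 | 0.0 | American Indian or Alaskan Native | Female | No | 0
37073 | 10112755207 | 4.0 | 0 | 23.0 | 0 | 0 | 0.000000 | 0.00 | B23Q | 0 | 3 | 1 | 3.2000 | 0.0 | Hispanic | Male | Yes | 0
37074 | 10112756177 | 12.0 | 1 | 26.0 | 0 | 1 | 0.000000 | 0.00 | B23Q | 0 | 3 | 1 | 3.6250 | 0.0 | American Indian or Alaskan Native | Female | No | 0
37075 | 10112756221 | 4.0 | 0 | 17.0 | 0 | 0 | 0.000000 | 0.00 | B23Q | 0 | 3 | 1 | 3.0769 | 0.0 | Hispanic | Male | Yes | 0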